Wildfire by day for different zip codes

First we can look at the PM2.5 concentrations in each zip code on each day.

Cases by day for different zip codes

Next we can look at the number of cases in each zip code on each day.

Defining the Wildfire Period

Next we'll classify each day-zip combination as either experiencing some wildfire exposure or not. The wildfire (WF) period is defined as any day on which the zip code has a PM2.5 value greater than zero (note: some of these values are very low). The WF period(s) stand out more clearly in the second plot, which uses a threshold of 1 instead of 0.
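The classification described above can be sketched as follows. This is a minimal illustration in base R, not the code used for the plots: the data frame `toy`, the column name `pm25`, and the helper `classify_wildfire()` are all hypothetical. Only the labels `WFperiod` and `nonWFperiod` are taken from the code later in this document.

```r
# Hypothetical toy data; `pm25` is an assumed column name, values are made up
toy <- data.frame(
  zip  = c("A", "A", "B", "B"),
  day  = c(1, 2, 1, 2),
  pm25 = c(0, 0.4, 12.3, 0)
)

# Label each day-zip row as wildfire or non-wildfire, given a PM2.5 threshold
classify_wildfire <- function(df, threshold = 0) {
  df$wildfire <- ifelse(df$pm25 > threshold, "WFperiod", "nonWFperiod")
  df
}

wf_threshold0 <- classify_wildfire(toy, threshold = 0)$wildfire
wf_threshold1 <- classify_wildfire(toy, threshold = 1)$wildfire

# Compare how many rows fall in each period under the two thresholds
table(wf_threshold0)
table(wf_threshold1)
```

With a threshold of 0, the very low value of 0.4 counts as wildfire exposure; with a threshold of 1 it does not, which is the difference the two plots are meant to show.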

Calculating RR

Now we can begin to calculate the rate ratio (RR), first for each zip code overall (not stratified by race). We'll use a constant of 0.1 in place of the case count wherever there are no cases during the exposure or control period.
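Written out, the quantity computed by the code in this section is the following. The symbols are introduced here for illustration only: $c$ is the case count and $d$ the number of days for zip code $z$ in each period, with any $c = 0$ replaced by the constant 0.1.

$$
\mathrm{RR}_z = \frac{c_{z,\mathrm{WF}} / d_{z,\mathrm{WF}}}{c_{z,\mathrm{nonWF}} / d_{z,\mathrm{nonWF}}},
\qquad
\ln \mathrm{RR}_z = \ln\!\left(\frac{c_{z,\mathrm{WF}}}{d_{z,\mathrm{WF}}}\right) - \ln\!\left(\frac{c_{z,\mathrm{nonWF}}}{d_{z,\mathrm{nonWF}}}\right)
$$

As a made-up example, a zip code with 0 cases over 10 wildfire days and 30 cases over 100 non-wildfire days gets a wildfire rate of $0.1/10 = 0.01$ and a non-wildfire rate of $30/100 = 0.3$, so $\mathrm{RR} = 0.01/0.3 \approx 0.033$ and $\ln \mathrm{RR} \approx -3.40$.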

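# Small constant substituted for zero case counts so that rates,
# and therefore log rate ratios, stay finite
constant <- 0.1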

rates_overall <- foo %>%
  mutate(total = white + hispanic + black + native_am + asian_pi + other + missing) %>%
  group_by(zip, wildfire) %>% 
  summarise(cases = sum(total),
            days = length(svc_from_dt)) %>%
    mutate(casesPerDay =  cases/days)

RR_overall <- foo %>%
  mutate(total = white + hispanic + black + native_am + asian_pi + other + missing) %>%
  group_by(zip, wildfire) %>% 
  summarise(cases = sum(total),
            days = length(svc_from_dt)) %>%
    mutate(cases = ifelse(cases == 0, constant, cases),
      casesPerDay = cases/days) %>%
    select(zip, wildfire, casesPerDay) %>%
    spread(key = wildfire, value = casesPerDay) %>%
    mutate(RateRatio = WFperiod/nonWFperiod,
           ln_RateRatio = log(RateRatio)) %>%
  rename(WF_rate = WFperiod, 
         nonWF_rate = nonWFperiod)

DT::datatable({rates_overall}) %>% DT::formatRound(5,5) 
DT::datatable({RR_overall}) %>% DT::formatRound(c(2:5),3)

Plot Overall rate ratios

Calculating race-specific RR

Now we can calculate the race-specific RRs.

RR_race <- foo %>%
  group_by(zip, wildfire) %>% 
  summarise(white = sum(white),
            hispanic = sum(hispanic),
            black = sum(black),
            native_am = sum(native_am),
            asian_pi = sum(asian_pi),
            other = sum(other),
            days = length(svc_from_dt)) %>%
  gather(white, hispanic, black, native_am, asian_pi,other, key = race, value = cases) %>%
    mutate(cases = ifelse(cases == 0, constant, cases),
           casesPerDay = cases/days) %>%
    select(zip, wildfire, casesPerDay, race) %>%
    spread(key = wildfire, value = casesPerDay) %>%
    mutate(RateRatio = WFperiod/nonWFperiod,
           ln_RateRatio = log(WFperiod/nonWFperiod))
  
DT::datatable({RR_race}) %>% DT::formatRound(c(3:6),3)

Look at the distributions of the race-specific rate ratios. It is not yet clear why so many of them cluster around 1.5; this needs further investigation.

Sensitivity to the Constant Value

Change the value of the constant to see how it changes the race-specific rate ratios.

foo %>%
  group_by(zip, wildfire) %>% 
  summarise(white = sum(white),
            hispanic = sum(hispanic),
            black = sum(black),
            native_am = sum(native_am),
            asian_pi = sum(asian_pi),
            other = sum(other),
            days = length(svc_from_dt)) %>%
  gather(white, hispanic, black, native_am, asian_pi,other, key = race, value = cases) %>%
    mutate(cases = ifelse(cases == 0, constant, cases),
           casesPerDay = cases/days) %>%
    select(zip, wildfire, casesPerDay, race) %>%
    spread(key = wildfire, value = casesPerDay) %>%
    mutate(RateRatio = WFperiod/nonWFperiod,
           ln_RateRatio = log(WFperiod/nonWFperiod)) %>%
    ggplot(aes(x = race, y = ln_RateRatio)) +
    geom_boxplot() +
    geom_jitter(aes(color = race), alpha = 0.2) +
    guides(color = "none")  # guides(color = FALSE) is deprecated in recent ggplot2
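To compare several constants side by side rather than re-running the chunk by hand, something like the following sketch could be used. It is written in base R on made-up counts; the data frame `cells` and the helper `ln_rr()` are hypothetical and only mirror the zero-replacement and rate-ratio steps of the pipeline above.

```r
# Hypothetical counts for three zip-race cells (not real data)
cells <- data.frame(
  cell      = c("a", "b", "c"),
  cases_wf  = c(0, 4, 0),      # cases during wildfire days
  days_wf   = c(10, 10, 10),
  cases_non = c(30, 0, 0),     # cases during non-wildfire days
  days_non  = c(100, 100, 100)
)

# Replace zero counts with `constant`, then return the log rate ratio
ln_rr <- function(df, constant) {
  wf  <- ifelse(df$cases_wf == 0, constant, df$cases_wf)
  non <- ifelse(df$cases_non == 0, constant, df$cases_non)
  log((wf / df$days_wf) / (non / df$days_non))
}

# One column of log rate ratios per candidate constant
constants <- c(0.01, 0.1, 0.5)
sensitivity <- sapply(constants, function(k) ln_rr(cells, k))
dimnames(sensitivity) <- list(cells$cell, paste0("k=", constants))
round(sensitivity, 2)
```

One property of the formula is visible in cell `c`: when both periods have zero cases, the constant cancels and the rate ratio reduces to the ratio of non-wildfire days to wildfire days, whatever constant is chosen. Whether this accounts for any of the clustering seen in the real data has not been checked here.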

Merge, Weight, and Model

Combine with other covariates, calculate the weights and